Overview

Dataset statistics

Statistic | Before imputation test | After imputation_test
Number of variables | 15 | 14
Number of observations | 4277 | 4277
Missing cells | 1123 | 0
Missing cells (%) | 1.8% | 0.0%
Duplicate rows | 0 | 207
Duplicate rows (%) | 0.0% | 4.8%
Total size in memory | 501.3 KiB | 409.5 KiB
Average record size in memory | 120.0 B | 98.0 B
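
The percentage rows above follow from the raw counts: missing cells are divided by rows times variables, and duplicate rows by the number of rows. A minimal sketch that re-derives both figures; the helper name `percent` is illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_rows = 4277
n_columns_before = 15
missing_cells_before = 1123
duplicate_rows_after = 207


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Missing cells (%) for "Before imputation test".
missing_pct = percent(missing_cells_before, n_rows * n_columns_before)
# Duplicate rows (%) for "After imputation_test".
duplicate_pct = percent(duplicate_rows_after, n_rows)

print(missing_pct, duplicate_pct)
```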

Variable types

Type | Before imputation test | After imputation_test
Categorical | 4 | 4
Boolean | 2 | 2
Numeric | 9 | 8

Alerts

Before imputation test | After imputation_test | Alert type
VIP is highly imbalanced (87.2%) | VIP is highly imbalanced (87.7%) | Imbalance
HomePlanet has 87 (2.0%) missing values | Alert not present in this dataset | Missing
CryoSleep has 93 (2.2%) missing values | Alert not present in this dataset | Missing
Destination has 92 (2.2%) missing values | Alert not present in this dataset | Missing
Age has 91 (2.1%) missing values | Alert not present in this dataset | Missing
VIP has 93 (2.2%) missing values | Alert not present in this dataset | Missing
RoomService has 82 (1.9%) missing values | Alert not present in this dataset | Missing
FoodCourt has 106 (2.5%) missing values | Alert not present in this dataset | Missing
ShoppingMall has 98 (2.3%) missing values | Alert not present in this dataset | Missing
Spa has 101 (2.4%) missing values | Alert not present in this dataset | Missing
VRDeck has 80 (1.9%) missing values | Alert not present in this dataset | Missing
Cabin_deck has 100 (2.3%) missing values | Alert not present in this dataset | Missing
Cabin_side has 100 (2.3%) missing values | Alert not present in this dataset | Missing
Age has 82 (1.9%) zeros | Age has 82 (1.9%) zeros | Zeros
RoomService has 2726 (63.7%) zeros | RoomService has 2753 (64.4%) zeros | Zeros
FoodCourt has 2690 (62.9%) zeros | FoodCourt has 2731 (63.9%) zeros | Zeros
ShoppingMall has 2744 (64.2%) zeros | ShoppingMall has 2782 (65.0%) zeros | Zeros
Spa has 2611 (61.0%) zeros | Spa has 2660 (62.2%) zeros | Zeros
VRDeck has 2757 (64.5%) zeros | VRDeck has 2794 (65.3%) zeros | Zeros
Alert not present in this dataset | Dataset has 207 (4.8%) duplicate rows | Duplicates
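
The imbalance percentages for VIP are consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. The sketch below assumes that definition (the report itself does not state the formula) and uses the VIP False/True counts listed in the VIP section of this report; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (0 = balanced, 1 = one class only)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


# VIP counts (False, True), missing values excluded.
score_before = imbalance_score([4110, 74])
score_after = imbalance_score([4205, 72])

print(round(100 * score_before, 1), round(100 * score_after, 1))
```

Under this assumed definition the two scores round to the 87.2% and 87.7% shown in the alerts.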

Reproduction

Item | Before imputation test | After imputation_test
Analysis started | 2024-04-23 18:44:47.213025 | 2024-04-23 18:45:02.964019
Analysis finished | 2024-04-23 18:45:02.949491 | 2024-04-23 18:45:15.372434
Duration | 15.74 seconds | 12.41 seconds
Software version | ydata-profiling v4.7.0 | ydata-profiling v4.7.0
Download configuration | config.json | config.json

Variables

HomePlanet
Categorical

Statistic | Before imputation test | After imputation_test
Distinct | 3 | 3
Distinct (%) | 0.1% | 0.1%
Missing | 87 | 0
Missing (%) | 2.0% | 0.0%
Memory size | 33.5 KiB | 33.5 KiB

Category | Before imputation test | After imputation_test
Earth | 2263 | 2306
Europa | 1002 | 1024
Mars | 925 | 947

Length

Statistic | Before imputation test | After imputation_test
Max length | 6 | 6
Median length | 5 | 5
Mean length | 5.0183771 | 5.0180033
Min length | 4 | 4
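
The mean length is the total number of characters divided by the number of non-missing values. A short sketch that recomputes it from the HomePlanet category counts reported for this variable; the function name `length_summary` is illustrative.

```python
# Category counts for HomePlanet, copied from this report.
before_counts = {"Earth": 2263, "Europa": 1002, "Mars": 925}
after_counts = {"Earth": 2306, "Europa": 1024, "Mars": 947}


def length_summary(counts):
    """Return (total characters, mean string length) for a value -> count mapping."""
    total_chars = sum(len(value) * n for value, n in counts.items())
    n_values = sum(counts.values())
    return total_chars, total_chars / n_values


total_before, mean_before = length_summary(before_counts)
total_after, mean_after = length_summary(after_counts)

print(total_before, round(mean_before, 7))
print(total_after, round(mean_after, 7))
```

The totals match the "Total characters" figures reported for HomePlanet, and the means match the table above to seven decimal places.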

Characters and Unicode

Statistic | Before imputation test | After imputation_test
Total characters | 21027 | 21462
Distinct characters | 10 | 10
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Before imputation test | After imputation_test
Unique | 0 | 0
Unique (%) | 0.0% | 0.0%

Sample

Row | Before imputation test | After imputation_test
1st row | Earth | Earth
2nd row | Earth | Earth
3rd row | Europa | Europa
4th row | Europa | Europa
5th row | Earth | Earth

Common Values

Before imputation test
Value | Count | Frequency (%)
Earth | 2263 | 52.9%
Europa | 1002 | 23.4%
Mars | 925 | 21.6%
(Missing) | 87 | 2.0%

After imputation_test
Value | Count | Frequency (%)
Earth | 2306 | 53.9%
Europa | 1024 | 23.9%
Mars | 947 | 22.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figures: Before imputation test, After imputation_test]

Before imputation test
Value | Count | Frequency (%)
earth | 2263 | 54.0%
europa | 1002 | 23.9%
mars | 925 | 22.1%

After imputation_test
Value | Count | Frequency (%)
earth | 2306 | 53.9%
europa | 1024 | 23.9%
mars | 947 | 22.1%

Most occurring characters

Before imputation test
Value | Count | Frequency (%)
a | 4190 | 19.9%
r | 4190 | 19.9%
E | 3265 | 15.5%
t | 2263 | 10.8%
h | 2263 | 10.8%
u | 1002 | 4.8%
o | 1002 | 4.8%
p | 1002 | 4.8%
M | 925 | 4.4%
s | 925 | 4.4%

After imputation_test
Value | Count | Frequency (%)
a | 4277 | 19.9%
r | 4277 | 19.9%
E | 3330 | 15.5%
t | 2306 | 10.7%
h | 2306 | 10.7%
u | 1024 | 4.8%
o | 1024 | 4.8%
p | 1024 | 4.8%
M | 947 | 4.4%
s | 947 | 4.4%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 21027 | 100.0%
After imputation_test | (unknown) | 21462 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 21027 | 100.0%
After imputation_test | (unknown) | 21462 | 100.0%

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 21027 | 100.0%
After imputation_test | (unknown) | 21462 | 100.0%

CryoSleep
Boolean

Statistic | Before imputation test | After imputation_test
Distinct | 2 | 2
Distinct (%) | < 0.1% | < 0.1%
Missing | 93 | 0
Missing (%) | 2.2% | 0.0%
Memory size | 33.5 KiB | 4.3 KiB

Before imputation test
Value | Count | Frequency (%)
False | 2640 | 61.7%
True | 1544 | 36.1%
(Missing) | 93 | 2.2%

After imputation_test
Value | Count | Frequency (%)
False | 2699 | 63.1%
True | 1578 | 36.9%

[Figures: Before imputation test, After imputation_test]

Destination
Categorical

Statistic | Before imputation test | After imputation_test
Distinct | 3 | 3
Distinct (%) | 0.1% | 0.1%
Missing | 92 | 0
Missing (%) | 2.2% | 0.0%
Memory size | 33.5 KiB | 33.5 KiB

Category | Before imputation test | After imputation_test
TRAPPIST-1e | 2956 | 3039
55 Cancri e | 841 | 850
PSO J318.5-22 | 388 | 388

Length

Statistic | Before imputation test | After imputation_test
Max length | 13 | 13
Median length | 11 | 11
Mean length | 11.185424 | 11.181436
Min length | 11 | 11

Characters and Unicode

Statistic | Before imputation test | After imputation_test
Total characters | 46811 | 47823
Distinct characters | 23 | 23
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Before imputation test | After imputation_test
Unique | 0 | 0
Unique (%) | 0.0% | 0.0%

Sample

Row | Before imputation test | After imputation_test
1st row | TRAPPIST-1e | TRAPPIST-1e
2nd row | TRAPPIST-1e | TRAPPIST-1e
3rd row | 55 Cancri e | 55 Cancri e
4th row | TRAPPIST-1e | TRAPPIST-1e
5th row | TRAPPIST-1e | TRAPPIST-1e

Common Values

Before imputation test
Value | Count | Frequency (%)
TRAPPIST-1e | 2956 | 69.1%
55 Cancri e | 841 | 19.7%
PSO J318.5-22 | 388 | 9.1%
(Missing) | 92 | 2.2%

After imputation_test
Value | Count | Frequency (%)
TRAPPIST-1e | 3039 | 71.1%
55 Cancri e | 850 | 19.9%
PSO J318.5-22 | 388 | 9.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figures: Before imputation test, After imputation_test]

Before imputation test
Value | Count | Frequency (%)
trappist-1e | 2956 | 47.3%
55 | 841 | 13.4%
cancri | 841 | 13.4%
e | 841 | 13.4%
pso | 388 | 6.2%
j318.5-22 | 388 | 6.2%

After imputation_test
Value | Count | Frequency (%)
trappist-1e | 3039 | 47.7%
55 | 850 | 13.4%
cancri | 850 | 13.4%
e | 850 | 13.4%
pso | 388 | 6.1%
j318.5-22 | 388 | 6.1%

Most occurring characters

Before imputation test
Value | Count | Frequency (%)
P | 6300 | 13.5%
T | 5912 | 12.6%
e | 3797 | 8.1%
S | 3344 | 7.1%
- | 3344 | 7.1%
1 | 3344 | 7.1%
A | 2956 | 6.3%
I | 2956 | 6.3%
R | 2956 | 6.3%
5 | 2070 | 4.4%
Other values (13) | 9832 | 21.0%

After imputation_test
Value | Count | Frequency (%)
P | 6466 | 13.5%
T | 6078 | 12.7%
e | 3889 | 8.1%
S | 3427 | 7.2%
- | 3427 | 7.2%
1 | 3427 | 7.2%
A | 3039 | 6.4%
I | 3039 | 6.4%
R | 3039 | 6.4%
5 | 2088 | 4.4%
Other values (13) | 9904 | 20.7%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 46811 | 100.0%
After imputation_test | (unknown) | 47823 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 46811 | 100.0%
After imputation_test | (unknown) | 47823 | 100.0%

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 46811 | 100.0%
After imputation_test | (unknown) | 47823 | 100.0%

Age
Real number (ℝ)

Statistic | Before imputation test | After imputation_test
Distinct | 79 | 149
Distinct (%) | 1.9% | 3.5%
Missing | 91 | 0
Missing (%) | 2.1% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 28.658146 | 28.661582
Minimum | 0 | 0
Maximum | 79 | 79
Zeros | 82 | 82
Zeros (%) | 1.9% | 1.9%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 33.5 KiB | 33.5 KiB

Quantile statistics

Statistic | Before imputation test | After imputation_test
Minimum | 0 | 0
5-th percentile | 5 | 5
Q1 | 19 | 20
median | 26 | 26
Q3 | 37 | 37
95-th percentile | 55 | 55
Maximum | 79 | 79
Range | 79 | 79
Interquartile range (IQR) | 18 | 17
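
Range and IQR are derived from the other quantile rows: range is maximum minus minimum, and IQR is Q3 minus Q1. A small sketch using the Age quantiles above; `spread` is an illustrative helper name.

```python
def spread(minimum, q1, q3, maximum):
    """Return the range and interquartile range for the given quantiles."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


# Age quantiles copied from the table above.
age_before = spread(minimum=0, q1=19, q3=37, maximum=79)
age_after = spread(minimum=0, q1=20, q3=37, maximum=79)

print(age_before, age_after)
```

Imputation moved Q1 from 19 to 20, which is why the IQR narrows from 18 to 17 while the range is unchanged.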

Descriptive statistics

Statistic | Before imputation test | After imputation_test
Standard deviation | 14.179072 | 14.064134
Coefficient of variation (CV) | 0.49476583 | 0.49069636
Kurtosis | 0.21852293 | 0.25978995
Mean | 28.658146 | 28.661582
Median Absolute Deviation (MAD) | 8 | 8
Skewness | 0.48480029 | 0.4830656
Sum | 119963 | 122585.59
Variance | 201.04607 | 197.79986
Monotonicity | Not monotonic | Not monotonic
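
Several of the descriptive statistics are tied together: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of non-missing values. A quick consistency check on the "Before imputation test" column for Age, allowing for the rounding of the published figures:

```python
# Age, "Before imputation test", copied from this report.
n_rows = 4277
n_missing = 91
mean = 28.658146
std = 14.179072

n_present = n_rows - n_missing   # values that actually enter the statistics
cv = std / mean                  # coefficient of variation
variance = std ** 2
total = mean * n_present         # should match the reported Sum

print(round(cv, 8), round(variance, 3), round(total))
```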
[Figure: Histogram with fixed size bins (bins=50)]
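Common values, Before imputation test
Value | Count | Frequency (%)
18 | 176 | 4.1%
22 | 163 | 3.8%
19 | 162 | 3.8%
20 | 160 | 3.7%
24 | 158 | 3.7%
21 | 157 | 3.7%
25 | 156 | 3.6%
23 | 144 | 3.4%
26 | 132 | 3.1%
27 | 127 | 3.0%
Other values (69) | 2651 | 62.0%

Common values, After imputation_test
Value | Count | Frequency (%)
18 | 176 | 4.1%
22 | 163 | 3.8%
19 | 162 | 3.8%
20 | 160 | 3.7%
24 | 158 | 3.7%
21 | 157 | 3.7%
25 | 156 | 3.6%
23 | 144 | 3.4%
26 | 132 | 3.1%
27 | 127 | 3.0%
Other values (139) | 2742 | 64.1%

10 smallest values, Before imputation test
Value | Count | Frequency (%)
0 | 82 | 1.9%
1 | 27 | 0.6%
2 | 35 | 0.8%
3 | 34 | 0.8%
4 | 20 | 0.5%
5 | 20 | 0.5%
6 | 25 | 0.6%
7 | 13 | 0.3%
8 | 24 | 0.6%
9 | 21 | 0.5%

10 smallest values, After imputation_test
Value | Count | Frequency (%)
0 | 82 | 1.9%
1 | 27 | 0.6%
2 | 35 | 0.8%
3 | 34 | 0.8%
4 | 20 | 0.5%
5 | 20 | 0.5%
6 | 25 | 0.6%
7 | 13 | 0.3%
8 | 24 | 0.6%
8.451165929 | 1 | < 0.1%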

VIP
Boolean

Statistic | Before imputation test | After imputation_test
Distinct | 2 | 2
Distinct (%) | < 0.1% | < 0.1%
Missing | 93 | 0
Missing (%) | 2.2% | 0.0%
Memory size | 33.5 KiB | 4.3 KiB

Before imputation test
Value | Count | Frequency (%)
False | 4110 | 96.1%
True | 74 | 1.7%
(Missing) | 93 | 2.2%

After imputation_test
Value | Count | Frequency (%)
False | 4205 | 98.3%
True | 72 | 1.7%

[Figures: Before imputation test, After imputation_test]

RoomService
Real number (ℝ)

Statistic | Before imputation test | After imputation_test
Distinct | 842 | 897
Distinct (%) | 20.1% | 21.0%
Missing | 82 | 0
Missing (%) | 1.9% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 219.26627 | 219.56069
Minimum | 0 | -75.164219
Maximum | 11567 | 11567
Zeros | 2726 | 2753
Zeros (%) | 63.7% | 64.4%
Negative | 0 | 2
Negative (%) | 0.0% | < 0.1%
Memory size | 33.5 KiB | 33.5 KiB

Quantile statistics

Statistic | Before imputation test | After imputation_test
Minimum | 0 | -75.164219
5-th percentile | 0 | 0
Q1 | 0 | 0
median | 0 | 0
Q3 | 53 | 60
95-th percentile | 1274.5 | 1274
Maximum | 11567 | 11567
Range | 11567 | 11642.164
Interquartile range (IQR) | 53 | 60

Descriptive statistics

Statistic | Before imputation test | After imputation_test
Standard deviation | 607.01129 | 603.38079
Coefficient of variation (CV) | 2.7683751 | 2.7481276
Kurtosis | 53.216268 | 53.486683
Mean | 219.26627 | 219.56069
Median Absolute Deviation (MAD) | 0 | 0
Skewness | 5.5583897 | 5.5597325
Sum | 919822 | 939061.08
Variance | 368462.7 | 364068.38
Monotonicity | Not monotonic | Not monotonic
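
After imputation, RoomService contains two negative values (-75.16421948 and -37.74779523) although the original column has a minimum of 0. If the column represents an amount that cannot be negative (an assumption; the report does not define it), one possible post-processing step is to clip imputed values at zero. This is a sketch of that idea, not something the report's pipeline is known to do; `clip_non_negative` is a hypothetical helper.

```python
def clip_non_negative(values):
    """Replace negative entries with 0.0 and leave all other values unchanged."""
    return [max(0.0, v) for v in values]


# The two negative imputed RoomService values from this report,
# followed by a few ordinary values for contrast.
imputed = [-75.16421948, -37.74779523, 0.0, 1.0, 2.0]

n_negative = sum(1 for v in imputed if v < 0)
clipped = clip_non_negative(imputed)

print(n_negative, clipped)
```

Clipping would shift the mean and the zero count slightly, so it is worth re-profiling the data afterwards.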
[Figure: Histogram with fixed size bins (bins=50)]
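Common values, Before imputation test
Value | Count | Frequency (%)
0 | 2726 | 63.7%
1 | 68 | 1.6%
2 | 34 | 0.8%
3 | 28 | 0.7%
4 | 24 | 0.6%
6 | 16 | 0.4%
5 | 15 | 0.4%
9 | 13 | 0.3%
8 | 12 | 0.3%
13 | 11 | 0.3%
Other values (832) | 1248 | 29.2%
(Missing) | 82 | 1.9%

Common values, After imputation_test
Value | Count | Frequency (%)
0 | 2753 | 64.4%
1 | 68 | 1.6%
2 | 34 | 0.8%
3 | 28 | 0.7%
4 | 24 | 0.6%
6 | 16 | 0.4%
5 | 15 | 0.4%
9 | 13 | 0.3%
8 | 12 | 0.3%
13 | 11 | 0.3%
Other values (887) | 1303 | 30.5%

10 smallest values, Before imputation test
Value | Count | Frequency (%)
0 | 2726 | 63.7%
1 | 68 | 1.6%
2 | 34 | 0.8%
3 | 28 | 0.7%
4 | 24 | 0.6%
5 | 15 | 0.4%
6 | 16 | 0.4%
7 | 8 | 0.2%
8 | 12 | 0.3%
9 | 13 | 0.3%

10 smallest values, After imputation_test
Value | Count | Frequency (%)
-75.16421948 | 1 | < 0.1%
-37.74779523 | 1 | < 0.1%
0 | 2753 | 64.4%
1 | 68 | 1.6%
2 | 34 | 0.8%
3 | 28 | 0.7%
4 | 24 | 0.6%
5 | 15 | 0.4%
6 | 16 | 0.4%
7 | 8 | 0.2%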

FoodCourt
Real number (ℝ)

Statistic | Before imputation test | After imputation_test
Distinct | 902 | 967
Distinct (%) | 21.6% | 22.6%
Missing | 106 | 0
Missing (%) | 2.5% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 439.4843 | 438.12439
Minimum | 0 | -80.182766
Maximum | 25273 | 25273
Zeros | 2690 | 2731
Zeros (%) | 62.9% | 63.9%
Negative | 0 | 6
Negative (%) | 0.0% | 0.1%
Memory size | 33.5 KiB | 33.5 KiB

Quantile statistics

Statistic | Before imputation test | After imputation_test
Minimum | 0 | -80.182766
5-th percentile | 0 | 0
Q1 | 0 | 0
median | 0 | 0
Q3 | 78 | 87
95-th percentile | 2518.5 | 2570.4
Maximum | 25273 | 25273
Range | 25273 | 25353.183
Interquartile range (IQR) | 78 | 87

Descriptive statistics

Statistic | Before imputation test | After imputation_test
Standard deviation | 1527.663 | 1516.1588
Coefficient of variation (CV) | 3.4760356 | 3.4605671
Kurtosis | 67.764434 | 68.195785
Mean | 439.4843 | 438.12439
Median Absolute Deviation (MAD) | 0 | 0
Skewness | 6.9106254 | 6.915053
Sum | 1833089 | 1873858
Variance | 2333754.4 | 2298737.6
Monotonicity | Not monotonic | Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
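Common values, Before imputation test
Value | Count | Frequency (%)
0 | 2690 | 62.9%
1 | 59 | 1.4%
2 | 30 | 0.7%
4 | 22 | 0.5%
3 | 21 | 0.5%
6 | 20 | 0.5%
5 | 19 | 0.4%
7 | 13 | 0.3%
11 | 12 | 0.3%
10 | 12 | 0.3%
Other values (892) | 1273 | 29.8%
(Missing) | 106 | 2.5%

Common values, After imputation_test
Value | Count | Frequency (%)
0 | 2731 | 63.9%
1 | 59 | 1.4%
2 | 30 | 0.7%
4 | 22 | 0.5%
3 | 21 | 0.5%
6 | 20 | 0.5%
5 | 19 | 0.4%
7 | 13 | 0.3%
10 | 12 | 0.3%
11 | 12 | 0.3%
Other values (957) | 1338 | 31.3%

10 smallest values, Before imputation test
Value | Count | Frequency (%)
0 | 2690 | 62.9%
1 | 59 | 1.4%
2 | 30 | 0.7%
3 | 21 | 0.5%
4 | 22 | 0.5%
5 | 19 | 0.4%
6 | 20 | 0.5%
7 | 13 | 0.3%
8 | 11 | 0.3%
9 | 8 | 0.2%

10 smallest values, After imputation_test
Value | Count | Frequency (%)
-80.18276583 | 1 | < 0.1%
-42.69824711 | 1 | < 0.1%
-33.12278457 | 1 | < 0.1%
-29.61800784 | 1 | < 0.1%
-25.51534644 | 1 | < 0.1%
-19.12021623 | 1 | < 0.1%
0 | 2731 | 63.9%
1 | 59 | 1.4%
2 | 30 | 0.7%
3 | 21 | 0.5%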

ShoppingMall
Real number (ℝ)

Statistic | Before imputation test | After imputation_test
Distinct | 715 | 775
Distinct (%) | 17.1% | 18.1%
Missing | 98 | 0
Missing (%) | 2.3% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 177.29553 | 177.85586
Minimum | 0 | -47.093919
Maximum | 8292 | 8292
Zeros | 2744 | 2782
Zeros (%) | 64.2% | 65.0%
Negative | 0 | 2
Negative (%) | 0.0% | < 0.1%
Memory size | 33.5 KiB | 33.5 KiB

Quantile statistics

Statistic | Before imputation test | After imputation_test
Minimum | 0 | -47.093919
5-th percentile | 0 | 0
Q1 | 0 | 0
median | 0 | 0
Q3 | 33 | 39
95-th percentile | 994.1 | 993.2
Maximum | 8292 | 8292
Range | 8292 | 8339.0939
Interquartile range (IQR) | 33 | 39

Descriptive statistics

Statistic | Before imputation test | After imputation_test
Standard deviation | 560.82112 | 556.66316
Coefficient of variation (CV) | 3.1631995 | 3.1298556
Kurtosis | 68.221142 | 68.704805
Mean | 177.29553 | 177.85586
Median Absolute Deviation (MAD) | 0 | 0
Skewness | 6.8249391 | 6.8313261
Sum | 740918 | 760689.53
Variance | 314520.33 | 309873.88
Monotonicity | Not monotonic | Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
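Common values, Before imputation test
Value | Count | Frequency (%)
0 | 2744 | 64.2%
1 | 72 | 1.7%
3 | 35 | 0.8%
2 | 32 | 0.7%
4 | 24 | 0.6%
7 | 19 | 0.4%
9 | 17 | 0.4%
8 | 16 | 0.4%
12 | 13 | 0.3%
10 | 12 | 0.3%
Other values (705) | 1195 | 27.9%
(Missing) | 98 | 2.3%

Common values, After imputation_test
Value | Count | Frequency (%)
0 | 2782 | 65.0%
1 | 72 | 1.7%
3 | 35 | 0.8%
2 | 32 | 0.7%
4 | 24 | 0.6%
7 | 19 | 0.4%
9 | 17 | 0.4%
8 | 16 | 0.4%
12 | 13 | 0.3%
20 | 12 | 0.3%
Other values (765) | 1255 | 29.3%

10 smallest values, Before imputation test
Value | Count | Frequency (%)
0 | 2744 | 64.2%
1 | 72 | 1.7%
2 | 32 | 0.7%
3 | 35 | 0.8%
4 | 24 | 0.6%
5 | 11 | 0.3%
6 | 12 | 0.3%
7 | 19 | 0.4%
8 | 16 | 0.4%
9 | 17 | 0.4%

10 smallest values, After imputation_test
Value | Count | Frequency (%)
-47.09391947 | 1 | < 0.1%
-34.40882954 | 1 | < 0.1%
0 | 2782 | 65.0%
1 | 72 | 1.7%
2 | 32 | 0.7%
3 | 35 | 0.8%
4 | 24 | 0.6%
5 | 11 | 0.3%
5.815686241 | 1 | < 0.1%
6 | 12 | 0.3%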

Spa
Real number (ℝ)

Statistic | Before imputation test | After imputation_test
Distinct | 833 | 885
Distinct (%) | 19.9% | 20.7%
Missing | 101 | 0
Missing (%) | 2.4% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 303.05244 | 301.296
Minimum | 0 | -413.44814
Maximum | 19844 | 19844
Zeros | 2611 | 2660
Zeros (%) | 61.0% | 62.2%
Negative | 0 | 7
Negative (%) | 0.0% | 0.2%
Memory size | 33.5 KiB | 33.5 KiB

Quantile statistics

Statistic | Before imputation test | After imputation_test
Minimum | 0 | -413.44814
5-th percentile | 0 | 0
Q1 | 0 | 0
median | 0 | 0
Q3 | 50 | 56
95-th percentile | 1525 | 1520.6
Maximum | 19844 | 19844
Range | 19844 | 20257.448
Interquartile range (IQR) | 50 | 56

Descriptive statistics

Statistic | Before imputation test | After imputation_test
Standard deviation | 1117.186 | 1107.9852
Coefficient of variation (CV) | 3.6864445 | 3.6773977
Kurtosis | 80.460402 | 81.339495
Mean | 303.05244 | 301.296
Median Absolute Deviation (MAD) | 0 | 0
Skewness | 7.6902979 | 7.7192546
Sum | 1265547 | 1288643
Variance | 1248104.6 | 1227631.2
Monotonicity | Not monotonic | Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
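Common values, Before imputation test
Value | Count | Frequency (%)
0 | 2611 | 61.0%
1 | 72 | 1.7%
2 | 43 | 1.0%
3 | 29 | 0.7%
4 | 27 | 0.6%
6 | 23 | 0.5%
8 | 22 | 0.5%
7 | 19 | 0.4%
5 | 16 | 0.4%
9 | 16 | 0.4%
Other values (823) | 1298 | 30.3%
(Missing) | 101 | 2.4%

Common values, After imputation_test
Value | Count | Frequency (%)
0 | 2660 | 62.2%
1 | 72 | 1.7%
2 | 43 | 1.0%
3 | 29 | 0.7%
4 | 27 | 0.6%
6 | 23 | 0.5%
8 | 22 | 0.5%
7 | 19 | 0.4%
9 | 16 | 0.4%
5 | 16 | 0.4%
Other values (875) | 1350 | 31.6%

10 smallest values, Before imputation test
Value | Count | Frequency (%)
0 | 2611 | 61.0%
1 | 72 | 1.7%
2 | 43 | 1.0%
3 | 29 | 0.7%
4 | 27 | 0.6%
5 | 16 | 0.4%
6 | 23 | 0.5%
7 | 19 | 0.4%
8 | 22 | 0.5%
9 | 16 | 0.4%

10 smallest values, After imputation_test
Value | Count | Frequency (%)
-413.4481391 | 1 | < 0.1%
-85.55296547 | 1 | < 0.1%
-84.28616052 | 1 | < 0.1%
-55.48958133 | 1 | < 0.1%
-37.71264375 | 1 | < 0.1%
-30.44507366 | 1 | < 0.1%
-22.02837509 | 1 | < 0.1%
0 | 2660 | 62.2%
0.7608587164 | 1 | < 0.1%
1 | 72 | 1.7%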

VRDeck
Real number (ℝ)

Statistic | Before imputation test | After imputation_test
Distinct | 796 | 839
Distinct (%) | 19.0% | 19.6%
Missing | 80 | 0
Missing (%) | 1.9% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 310.71003 | 309.25262
Minimum | 0 | -222.47406
Maximum | 22272 | 22272
Zeros | 2757 | 2794
Zeros (%) | 64.5% | 65.3%
Negative | 0 | 4
Negative (%) | 0.0% | 0.1%
Memory size | 33.5 KiB | 33.5 KiB

Quantile statistics

Statistic | Before imputation test | After imputation_test
Minimum | 0 | -222.47406
5-th percentile | 0 | 0
Q1 | 0 | 0
median | 0 | 0
Q3 | 36 | 40.741895
95-th percentile | 1536.8 | 1518.2
Maximum | 22272 | 22272
Range | 22272 | 22494.474
Interquartile range (IQR) | 36 | 40.741895

Descriptive statistics

Statistic | Before imputation test | After imputation_test
Standard deviation | 1246.9947 | 1237.4421
Coefficient of variation (CV) | 4.0133714 | 4.0013958
Kurtosis | 93.842398 | 95.04692
Mean | 310.71003 | 309.25262
Median Absolute Deviation (MAD) | 0 | 0
Skewness | 8.38721 | 8.4306521
Sum | 1304050 | 1322673.4
Variance | 1554995.9 | 1531263
Monotonicity | Not monotonic | Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
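Common values, Before imputation test
Value | Count | Frequency (%)
0 | 2757 | 64.5%
1 | 72 | 1.7%
2 | 38 | 0.9%
3 | 33 | 0.8%
7 | 23 | 0.5%
6 | 21 | 0.5%
4 | 20 | 0.5%
5 | 17 | 0.4%
8 | 10 | 0.2%
19 | 10 | 0.2%
Other values (786) | 1196 | 28.0%
(Missing) | 80 | 1.9%

Common values, After imputation_test
Value | Count | Frequency (%)
0 | 2794 | 65.3%
1 | 72 | 1.7%
2 | 38 | 0.9%
3 | 33 | 0.8%
7 | 23 | 0.5%
6 | 21 | 0.5%
4 | 20 | 0.5%
5 | 17 | 0.4%
8 | 10 | 0.2%
32 | 10 | 0.2%
Other values (829) | 1239 | 29.0%

10 smallest values, Before imputation test
Value | Count | Frequency (%)
0 | 2757 | 64.5%
1 | 72 | 1.7%
2 | 38 | 0.9%
3 | 33 | 0.8%
4 | 20 | 0.5%
5 | 17 | 0.4%
6 | 21 | 0.5%
7 | 23 | 0.5%
8 | 10 | 0.2%
9 | 9 | 0.2%

10 smallest values, After imputation_test
Value | Count | Frequency (%)
-222.4740579 | 1 | < 0.1%
-133.6016053 | 1 | < 0.1%
-48.2432044 | 1 | < 0.1%
-16.02839378 | 1 | < 0.1%
0 | 2794 | 65.3%
1 | 72 | 1.7%
1.946862794 | 1 | < 0.1%
2 | 38 | 0.9%
3 | 33 | 0.8%
4 | 20 | 0.5%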

Cabin_deck
Categorical

Statistic | Before imputation test | After imputation_test
Distinct | 8 | 8
Distinct (%) | 0.2% | 0.2%
Missing | 100 | 0
Missing (%) | 2.3% | 0.0%
Memory size | 33.5 KiB | 33.5 KiB

Category | Before imputation test | After imputation_test
F | 1445 | 1497
G | 1222 | 1248
E | 447 | 450
B | 362 | 373
C | 355 | 360
Other values (3) | 346 | 349

Length

Statistic | Before imputation test | After imputation_test
Max length | 1 | 1
Median length | 1 | 1
Mean length | 1 | 1
Min length | 1 | 1

Characters and Unicode

Statistic | Before imputation test | After imputation_test
Total characters | 4177 | 4277
Distinct characters | 8 | 8
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Before imputation test | After imputation_test
Unique | 0 | 0
Unique (%) | 0.0% | 0.0%

Sample

Row | Before imputation test | After imputation_test
1st row | G | G
2nd row | F | F
3rd row | C | C
4th row | C | C
5th row | F | F

Common Values

Before imputation test
Value | Count | Frequency (%)
F | 1445 | 33.8%
G | 1222 | 28.6%
E | 447 | 10.5%
B | 362 | 8.5%
C | 355 | 8.3%
D | 242 | 5.7%
A | 98 | 2.3%
T | 6 | 0.1%
(Missing) | 100 | 2.3%

After imputation_test
Value | Count | Frequency (%)
F | 1497 | 35.0%
G | 1248 | 29.2%
E | 450 | 10.5%
B | 373 | 8.7%
C | 360 | 8.4%
D | 244 | 5.7%
A | 99 | 2.3%
T | 6 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figures: Before imputation test, After imputation_test]

Before imputation test
Value | Count | Frequency (%)
f | 1445 | 34.6%
g | 1222 | 29.3%
e | 447 | 10.7%
b | 362 | 8.7%
c | 355 | 8.5%
d | 242 | 5.8%
a | 98 | 2.3%
t | 6 | 0.1%

After imputation_test
Value | Count | Frequency (%)
f | 1497 | 35.0%
g | 1248 | 29.2%
e | 450 | 10.5%
b | 373 | 8.7%
c | 360 | 8.4%
d | 244 | 5.7%
a | 99 | 2.3%
t | 6 | 0.1%

Most occurring characters

Before imputation test
Value | Count | Frequency (%)
F | 1445 | 34.6%
G | 1222 | 29.3%
E | 447 | 10.7%
B | 362 | 8.7%
C | 355 | 8.5%
D | 242 | 5.8%
A | 98 | 2.3%
T | 6 | 0.1%

After imputation_test
Value | Count | Frequency (%)
F | 1497 | 35.0%
G | 1248 | 29.2%
E | 450 | 10.5%
B | 373 | 8.7%
C | 360 | 8.4%
D | 244 | 5.7%
A | 99 | 2.3%
T | 6 | 0.1%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 4177 | 100.0%
After imputation_test | (unknown) | 4277 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 4177 | 100.0%
After imputation_test | (unknown) | 4277 | 100.0%

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 4177 | 100.0%
After imputation_test | (unknown) | 4277 | 100.0%

Cabin_side
Categorical

Statistic | Before imputation test | After imputation_test
Distinct | 2 | 2
Distinct (%) | < 0.1% | < 0.1%
Missing | 100 | 0
Missing (%) | 2.3% | 0.0%
Memory size | 33.5 KiB | 33.5 KiB

Category | Before imputation test | After imputation_test
S | 2093 | 2141
P | 2084 | 2136

Length

Statistic | Before imputation test | After imputation_test
Max length | 1 | 1
Median length | 1 | 1
Mean length | 1 | 1
Min length | 1 | 1

Characters and Unicode

Statistic | Before imputation test | After imputation_test
Total characters | 4177 | 4277
Distinct characters | 2 | 2
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Before imputation test | After imputation_test
Unique | 0 | 0
Unique (%) | 0.0% | 0.0%

Sample

Row | Before imputation test | After imputation_test
1st row | S | S
2nd row | S | S
3rd row | S | S
4th row | S | S
5th row | S | S

Common Values

Before imputation test
Value | Count | Frequency (%)
S | 2093 | 48.9%
P | 2084 | 48.7%
(Missing) | 100 | 2.3%

After imputation_test
Value | Count | Frequency (%)
S | 2141 | 50.1%
P | 2136 | 49.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figures: Before imputation test, After imputation_test]

Before imputation test
Value | Count | Frequency (%)
s | 2093 | 50.1%
p | 2084 | 49.9%

After imputation_test
Value | Count | Frequency (%)
s | 2141 | 50.1%
p | 2136 | 49.9%

Most occurring characters

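Before imputation test
Value | Count | Frequency (%)
S | 2093 | 50.1%
P | 2084 | 49.9%

After imputation_test
Value | Count | Frequency (%)
S | 2141 | 50.1%
P | 2136 | 49.9%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 4177 | 100.0%
After imputation_test | (unknown) | 4277 | 100.0%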

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 4177 | 100.0%
After imputation_test | (unknown) | 4277 | 100.0%

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Before imputation test | (unknown) | 4177 | 100.0%
After imputation_test | (unknown) | 4277 | 100.0%

ID_group
Real number (ℝ)

Statistic | Value
Distinct | 3063
Distinct (%) | 71.6%
Missing | 0
Missing (%) | 0.0%
Infinite | 0
Infinite (%) | 0.0%
Mean | 4639.2965
Minimum | 13
Maximum | 9277
Zeros | 0
Zeros (%) | 0.0%
Negative | 0
Negative (%) | 0.0%
Memory size | 33.5 KiB

Quantile statistics

Statistic | Value
Minimum | 13
5-th percentile | 468.6
Q1 | 2249
median | 4639
Q3 | 7030
95-th percentile | 8840.4
Maximum | 9277
Range | 9264
Interquartile range (IQR) | 4781

Descriptive statistics

Statistic | Value
Standard deviation | 2716.1974
Coefficient of variation (CV) | 0.58547614
Kurtosis | -1.2392784
Mean | 4639.2965
Median Absolute Deviation (MAD) | 2391
Skewness | 0.0010577841
Sum | 19842271
Variance | 7377728.1
Monotonicity | Increasing

[Figure: Histogram with fixed size bins (bins=50)]
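
ID_group is the only variable here reported with a single column of statistics, and the overview lists 15 variables before imputation against 14 after, so it appears to be present in only one of the two datasets. If it was dropped from the imputed data (an assumption; the report does not say so), rows that differed only in their group ID would become identical, which is one possible explanation for the 207 duplicate rows flagged after imputation. The toy sketch below illustrates the effect with made-up rows; the counting convention (repeats of an earlier row) is also an assumption and may differ from the one the profiling tool uses.

```python
from collections import Counter


def count_duplicate_rows(rows):
    """Count rows that repeat an earlier row (first occurrences are not counted)."""
    return sum(n - 1 for n in Counter(rows).values())


# Hypothetical rows of (ID_group, HomePlanet, Age); not taken from the dataset.
with_id = [(13, "Earth", 27.0), (18, "Earth", 27.0), (19, "Mars", 31.0)]
# Drop the identifier column and keep the remaining fields.
without_id = [row[1:] for row in with_id]

dups_with_id = count_duplicate_rows(with_id)
dups_without_id = count_duplicate_rows(without_id)

print(dups_with_id, dups_without_id)
```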
Common values
Value | Count | Frequency (%)
6332 | 8 | 0.2%
6499 | 8 | 0.2%
6986 | 8 | 0.2%
8543 | 8 | 0.2%
1072 | 8 | 0.2%
339 | 8 | 0.2%
8980 | 7 | 0.2%
717 | 7 | 0.2%
9238 | 7 | 0.2%
1354 | 7 | 0.2%
Other values (3053) | 4201 | 98.2%

10 smallest values
Value | Count | Frequency (%)
13 | 1 | < 0.1%
18 | 1 | < 0.1%
19 | 1 | < 0.1%
21 | 1 | < 0.1%
23 | 1 | < 0.1%
27 | 1 | < 0.1%
29 | 1 | < 0.1%
32 | 2 | < 0.1%
33 | 1 | < 0.1%
37 | 1 | < 0.1%

10 largest values
Value | Count | Frequency (%)
9277 | 1 | < 0.1%
9273 | 1 | < 0.1%
9271 | 1 | < 0.1%
9269 | 1 | < 0.1%
9266 | 2 | < 0.1%
9265 | 1 | < 0.1%
9263 | 1 | < 0.1%
9262 | 1 | < 0.1%
9260 | 1 | < 0.1%
9258 | 1 | < 0.1%

ID_num
Real number (ℝ)

Statistic | Before imputation test | After imputation_test
Distinct | 8 | 8
Distinct (%) | 0.2% | 0.2%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 1.4987141 | 1.4987141
Minimum | 1 | 1
Maximum | 8 | 8
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 33.5 KiB | 33.5 KiB

Quantile statistics

Statistic | Before imputation test | After imputation_test
Minimum | 1 | 1
5-th percentile | 1 | 1
Q1 | 1 | 1
median | 1 | 1
Q3 | 2 | 2
95-th percentile | 4 | 4
Maximum | 8 | 8
Range | 7 | 7
Interquartile range (IQR) | 1 | 1

Descriptive statistics

Statistic | Before imputation test | After imputation_test
Standard deviation | 1.0182207 | 1.0182207
Coefficient of variation (CV) | 0.67939624 | 0.67939624
Kurtosis | 9.3710351 | 9.3710351
Mean | 1.4987141 | 1.4987141
Median Absolute Deviation (MAD) | 0 | 0
Skewness | 2.8120532 | 2.8120532
Sum | 6410 | 6410
Variance | 1.0367734 | 1.0367734
Monotonicity | Not monotonic | Not monotonic

[Figure: Histogram with fixed size bins (bins=8)]
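Common values, Before imputation test
Value | Count | Frequency (%)
1 | 3063 | 71.6%
2 | 723 | 16.9%
3 | 269 | 6.3%
4 | 107 | 2.5%
5 | 56 | 1.3%
6 | 33 | 0.8%
7 | 20 | 0.5%
8 | 6 | 0.1%

Common values, After imputation_test
Value | Count | Frequency (%)
1 | 3063 | 71.6%
2 | 723 | 16.9%
3 | 269 | 6.3%
4 | 107 | 2.5%
5 | 56 | 1.3%
6 | 33 | 0.8%
7 | 20 | 0.5%
8 | 6 | 0.1%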
ValueCountFrequency (%)
1 3063
71.6%
2 723
 
16.9%
3 269
 
6.3%
4 107
 
2.5%
5 56
 
1.3%
6 33
 
0.8%
7 20
 
0.5%
8 6
 
0.1%
ValueCountFrequency (%)
1 3063
71.6%
2 723
 
16.9%
3 269
 
6.3%
4 107
 
2.5%
5 56
 
1.3%
6 33
 
0.8%
7 20
 
0.5%
8 6
 
0.1%
ValueCountFrequency (%)
1 3063
71.6%
2 723
 
16.9%
3 269
 
6.3%
4 107
 
2.5%
5 56
 
1.3%
6 33
 
0.8%
7 20
 
0.5%
8 6
 
0.1%
ValueCountFrequency (%)
1 3063
71.6%
2 723
 
16.9%
3 269
 
6.3%
4 107
 
2.5%
5 56
 
1.3%
6 33
 
0.8%
7 20
 
0.5%
8 6
 
0.1%

Group_size
Real number (ℝ)

                                  Before imputation test  After imputation_test
Distinct                          8                       8
Distinct (%)                      0.2%                    0.2%
Missing                           0                       0
Missing (%)                       0.0%                    0.0%
Infinite                          0                       0
Infinite (%)                      0.0%                    0.0%
Mean                              1.9974281               1.9974281
Minimum                           1                       1
Maximum                           8                       8
Zeros                             0                       0
Zeros (%)                         0.0%                    0.0%
Negative                          0                       0
Negative (%)                      0.0%                    0.0%
Memory size                       33.5 KiB                33.5 KiB

Quantile statistics

                                  Before imputation test  After imputation_test
Minimum                           1                       1
5-th percentile                   1                       1
Q1                                1                       1
median                            1                       1
Q3                                2                       2
95-th percentile                  6                       6
Maximum                           8                       8
Range                             7                       7
Interquartile range (IQR)         1                       1

Descriptive statistics

                                  Before imputation test  After imputation_test
Standard deviation                1.5371127               1.5371127
Coefficient of variation (CV)     0.76954596              0.76954596
Kurtosis                          3.6742901               3.6742901
Mean                              1.9974281               1.9974281
Median Absolute Deviation (MAD)   0                       0
Skewness                          1.9695056               1.9695056
Sum                               8543                    8543
Variance                          2.3627156               2.3627156
Monotonicity                      Not monotonic           Not monotonic
Histogram with fixed size bins (bins=8)
Value  Count  Frequency (%)
1      2340   54.7%
2      908    21.2%
3      486    11.4%
4      204    4.8%
5      115    2.7%
7      98     2.3%
6      78     1.8%
8      48     1.1%
Value  Count  Frequency (%)
1      2340   54.7%
2      908    21.2%
3      486    11.4%
4      204    4.8%
5      115    2.7%
6      78     1.8%
7      98     2.3%
8      48     1.1%

Interactions

[Figures: 81 pairwise interaction plots, each rendered for "Before imputation test" and "After imputation_test". For 17 of the pairs the "After imputation_test" panel reads "Interaction plot not present for dataset".]

Missing values

[Figure: nullity by column, shown for "Before imputation test" and "After imputation_test".]
A simple visualization of nullity by column.

[Figure: nullity matrix, shown for "Before imputation test" and "After imputation_test".]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
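
The per-column nullity shown in these figures can be approximated with plain pandas. The sketch below is illustrative only: the toy frame and the helper name `nullity_by_column` are hypothetical, and only the column names are borrowed from this report.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; column names come from the report, values are invented.
df = pd.DataFrame({
    "Age": [27.0, np.nan, 31.0, 38.0],
    "FoodCourt": [0.0, 9.0, np.nan, np.nan],
    "ID_num": [1, 1, 1, 2],
})

def nullity_by_column(frame: pd.DataFrame) -> pd.DataFrame:
    """Count missing cells per column and express them as a share of rows."""
    missing = frame.isna().sum()
    return pd.DataFrame({
        "missing": missing,
        "missing_pct": 100.0 * missing / len(frame),
    })

summary = nullity_by_column(df)
print(summary)
```

Running the same helper on a dataset before and after imputation gives a quick textual counterpart to the two nullity figures.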

Sample

Before imputation test

      HomePlanet  CryoSleep  Destination    Age   VIP    RoomService  FoodCourt  ShoppingMall  Spa     VRDeck  Cabin_deck  Cabin_side  ID_group  ID_num  Group_size
0     Earth       True       TRAPPIST-1e    27.0  False  0.0          0.0        0.0           0.0     0.0     G           S           13        1       1
1     Earth       False      TRAPPIST-1e    19.0  False  0.0          9.0        0.0           2823.0  0.0     F           S           18        1       1
2     Europa      True       55 Cancri e    31.0  False  0.0          0.0        0.0           0.0     0.0     C           S           19        1       1
3     Europa      False      TRAPPIST-1e    38.0  False  0.0          6652.0     0.0           181.0   585.0   C           S           21        1       1
4     Earth       False      TRAPPIST-1e    20.0  False  10.0         0.0        635.0         0.0     0.0     F           S           23        1       1
5     Earth       False      TRAPPIST-1e    31.0  False  0.0          1615.0     263.0         113.0   60.0    F           P           27        1       1
6     Europa      True       55 Cancri e    21.0  False  0.0          NaN        0.0           0.0     0.0     B           P           29        1       1
7     Europa      True       TRAPPIST-1e    20.0  False  0.0          0.0        0.0           0.0     0.0     D           S           32        1       2
8     Europa      True       55 Cancri e    23.0  False  0.0          0.0        0.0           0.0     0.0     D           S           32        2       2
9     Earth       False      55 Cancri e    24.0  False  0.0          639.0      0.0           0.0     0.0     F           S           33        1       1


After imputation_test

      HomePlanet  CryoSleep  Destination    Age   VIP    RoomService  FoodCourt  ShoppingMall  Spa     VRDeck  Cabin_deck  Cabin_side  ID_num  Group_size
0     Earth       True       TRAPPIST-1e    27.0  False  0.0          0.0        0.0           0.0     0.0     G           S           1       1
1     Earth       False      TRAPPIST-1e    19.0  False  0.0          9.0        0.0           2823.0  0.0     F           S           1       1
2     Europa      True       55 Cancri e    31.0  False  0.0          0.0        0.0           0.0     0.0     C           S           1       1
3     Europa      False      TRAPPIST-1e    38.0  False  0.0          6652.0     0.0           181.0   585.0   C           S           1       1
4     Earth       False      TRAPPIST-1e    20.0  False  10.0         0.0        635.0         0.0     0.0     F           S           1       1
5     Earth       False      TRAPPIST-1e    31.0  False  0.0          1615.0     263.0         113.0   60.0    F           P           1       1
6     Europa      True       55 Cancri e    21.0  False  0.0          0.0        0.0           0.0     0.0     B           P           1       1
7     Europa      True       TRAPPIST-1e    20.0  False  0.0          0.0        0.0           0.0     0.0     D           S           1       2
8     Europa      True       55 Cancri e    23.0  False  0.0          0.0        0.0           0.0     0.0     D           S           2       2
9     Earth       False      55 Cancri e    24.0  False  0.0          639.0      0.0           0.0     0.0     F           S           1       1


Before imputation test

      HomePlanet  CryoSleep  Destination    Age   VIP    RoomService  FoodCourt  ShoppingMall  Spa     VRDeck  Cabin_deck  Cabin_side  ID_group  ID_num  Group_size
4267  Earth       True       55 Cancri e    3.0   NaN    0.0          0.0        0.0           0.0     0.0     G           P           9260      1       1
4268  Earth       False      55 Cancri e    20.0  False  0.0          601.0      103.0         35.0    0.0     F           S           9262      1       1
4269  Earth       True       TRAPPIST-1e    43.0  False  0.0          0.0        0.0           0.0     0.0     G           S           9263      1       1
4270  Mars        False      TRAPPIST-1e    43.0  False  47.0         0.0        3851.0        0.0     0.0     D           S           9265      1       1
4271  Earth       False      TRAPPIST-1e    40.0  False  0.0          865.0      0.0           3.0     0.0     F           S           9266      1       2
4272  Earth       True       TRAPPIST-1e    34.0  False  0.0          0.0        0.0           0.0     0.0     G           S           9266      2       2
4273  Earth       False      TRAPPIST-1e    42.0  False  0.0          847.0      17.0          10.0    144.0   NaN         NaN         9269      1       1
4274  Mars        True       55 Cancri e    NaN   False  0.0          0.0        0.0           0.0     0.0     D           P           9271      1       1
4275  Europa      False      NaN            NaN   False  0.0          2680.0     0.0           0.0     523.0   D           P           9273      1       1
4276  Earth       True       PSO J318.5-22  43.0  False  0.0          0.0        0.0           0.0     0.0     G           S           9277      1       1


After imputation_test

      HomePlanet  CryoSleep  Destination    Age        VIP    RoomService  FoodCourt  ShoppingMall  Spa     VRDeck  Cabin_deck  Cabin_side  ID_num  Group_size
4267  Earth       True       55 Cancri e    3.000000   False  0.0          0.0        0.0           0.0     0.0     G           P           1       1
4268  Earth       False      55 Cancri e    20.000000  False  0.0          601.0      103.0         35.0    0.0     F           S           1       1
4269  Earth       True       TRAPPIST-1e    43.000000  False  0.0          0.0        0.0           0.0     0.0     G           S           1       1
4270  Mars        False      TRAPPIST-1e    43.000000  False  47.0         0.0        3851.0        0.0     0.0     D           S           1       1
4271  Earth       False      TRAPPIST-1e    40.000000  False  0.0          865.0      0.0           3.0     0.0     F           S           1       2
4272  Earth       True       TRAPPIST-1e    34.000000  False  0.0          0.0        0.0           0.0     0.0     G           S           2       2
4273  Earth       False      TRAPPIST-1e    42.000000  False  0.0          847.0      17.0          10.0    144.0   F           S           1       1
4274  Mars        True       55 Cancri e    33.213298  False  0.0          0.0        0.0           0.0     0.0     D           P           1       1
4275  Europa      False      TRAPPIST-1e    35.343316  False  0.0          2680.0     0.0           0.0     523.0   D           P           1       1
4276  Earth       True       PSO J318.5-22  43.000000  False  0.0          0.0        0.0           0.0     0.0     G           S           1       1


Duplicate rows

Before imputation test

HomePlanet  CryoSleep  Destination  Age  VIP  RoomService  FoodCourt  ShoppingMall  Spa  VRDeck  Cabin_deck  Cabin_side  ID_group  ID_num  Group_size  # duplicates
Dataset does not contain duplicate rows.

After imputation_test

      HomePlanet  CryoSleep  Destination    Age   VIP    RoomService  FoodCourt  ShoppingMall  Spa     VRDeck  Cabin_deck  Cabin_side  ID_num  Group_size  # duplicates
74    Earth       True       TRAPPIST-1e    18.0  False  0.0          0.0        0.0           0.0     0.0     G           P           1       1           11
80    Earth       True       TRAPPIST-1e    21.0  False  0.0          0.0        0.0           0.0     0.0     G           P           1       1           9
82    Earth       True       TRAPPIST-1e    22.0  False  0.0          0.0        0.0           0.0     0.0     G           P           1       1           9
67    Earth       True       TRAPPIST-1e    13.0  False  0.0          0.0        0.0           0.0     0.0     G           P           1       1           7
69    Earth       True       TRAPPIST-1e    16.0  False  0.0          0.0        0.0           0.0     0.0     G           P           1       1           7
78    Earth       True       TRAPPIST-1e    20.0  False  0.0          0.0        0.0           0.0     0.0     G           P           1       1           7
180   Mars        True       TRAPPIST-1e    25.0  False  0.0          0.0        0.0           0.0     0.0     F           S           1       1           7
29    Earth       True       PSO J318.5-22  18.0  False  0.0          0.0        0.0           0.0     0.0     G           S           1       1           6
31    Earth       True       PSO J318.5-22  19.0  False  0.0          0.0        0.0           0.0     0.0     G           S           1       1           6
76    Earth       True       TRAPPIST-1e    19.0  False  0.0          0.0        0.0           0.0     0.0     G           P           1       1           6
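
The report does not say why duplicates appear only after imputation. One plausible reading, offered here as an assumption, is that the "After imputation_test" sample no longer carries the ID_group column, so rows that differed only by that identifier become identical. The sketch below illustrates the effect on a hypothetical three-row frame; `duplicate_summary` is an illustrative helper, not part of the profiling tool.

```python
import pandas as pd

# Hypothetical toy data; ID_group values are invented identifiers.
df = pd.DataFrame({
    "HomePlanet": ["Earth", "Earth", "Mars"],
    "Age": [18.0, 18.0, 25.0],
    "ID_group": [101, 202, 303],
})

def duplicate_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """List each distinct row that occurs more than once, with its count."""
    cols = list(frame.columns)
    counts = (
        frame.groupby(cols, dropna=False)   # keep NaN keys as their own group
        .size()
        .reset_index(name="# duplicates")
    )
    repeated = counts[counts["# duplicates"] > 1]
    return repeated.sort_values("# duplicates", ascending=False)

before = duplicate_summary(df)                             # identifier keeps rows unique
after = duplicate_summary(df.drop(columns=["ID_group"]))   # identifier removed

print(len(before), "repeated row patterns with ID_group")
print(after)
```

If this reading holds for the real data, the 207 duplicate rows would be a side effect of dropping the identifier rather than of the imputation itself; checking that requires the original frames.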